From Machine Readable Dictionaries to Lexicons for NLP: the Cobuild Dictionaries - a Different Approach
Abstract
We describe the results of a syntactic-semantic parser for Cobuild dictionary definitions. In contrast to previous work on the automatic analysis of machine-readable dictionaries, we exploit the particular structure of the Cobuild definition to derive information that classifies the lexical item mainly in terms of the selectional restrictions or preferences encoded on its arguments. The resulting formalized lexical entries contain data that has generally been lacking in other lexical representations but which is expected to be very useful for a wide range of NLP purposes. We show how this information can be used in dictionary sense disambiguation by creating links throughout the lexicon on both the paradigmatic and the syntagmatic axes.
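To make the idea of reading selectional restrictions off a full-sentence definition more concrete, here is a minimal sketch in Python. It is not the parser described in the abstract: the regular expression, the `LexicalEntry` fields, the `SEMANTIC_CLASS` table, and the example sentences are all assumptions made for illustration, and the pattern covers only one definition shape ("If you VERB OBJECT, EXPLANATION.").

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical pattern for ONE definition shape:
#   "If you <verb> <object phrase>, <explanation>."
# Real dictionary definitions vary far more than this.
VERB_DEFINITION = re.compile(
    r"^If (?P<subject>you|someone|something) (?P<verb>\w+) "
    r"(?P<object>[^,]+), (?P<explanation>.+?)\.?$"
)

# Toy mapping from definition wording to coarse semantic classes (assumed).
SEMANTIC_CLASS = {"you": "human", "someone": "human", "something": "thing"}


@dataclass(frozen=True)
class LexicalEntry:
    headword: str
    subject_restriction: str
    object_restrictions: Tuple[str, ...]
    explanation: str


def parse_definition(text: str) -> Optional[LexicalEntry]:
    """Return a formalized entry, or None if the sentence does not fit the pattern."""
    match = VERB_DEFINITION.match(text.strip())
    if match is None:
        return None
    # Alternatives such as "someone or something" become separate restrictions;
    # phrases missing from the table (e.g. "a liquid") are kept verbatim.
    alternatives = match.group("object").split(" or ")
    objects = tuple(SEMANTIC_CLASS.get(a, a) for a in alternatives)
    return LexicalEntry(
        headword=match.group("verb"),
        subject_restriction=SEMANTIC_CLASS[match.group("subject")],
        object_restrictions=objects,
        explanation=match.group("explanation"),
    )


if __name__ == "__main__":
    # Invented sentences in the full-sentence style, not quoted from the dictionary.
    print(parse_definition("If you kick someone or something, you hit them with your foot."))
    print(parse_definition("If you drink a liquid, you take it into your mouth and swallow it."))
```

In this sketch the object phrase of the definition becomes the restriction on the verb's object argument. Entries that share a restriction could then be linked to one another, which is one simple way to picture the cross-lexicon links mentioned above.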
Similar resources
NLP lexicons: innovative constructions and usages for machines and humans
Lexical resources have undergone significant changes with the generalized use of computers and the advent of the Internet. However, while such changes stand for revolutions when it comes to comparing machine-readable dictionaries to their paper 'ancestors', machine-readable dictionaries, compiled for human readers, still have serious limitations. Natural language processing lexicons, initially de...
ITRI-95-19 MRDs, Standards and How To Do Lexical Engineering
How can you obtain a satisfactory lexicon for a modern NLP application? Ten years ago the answer you might have received was to wait for the large-scale lexicons soon to be derived from machine-readable dictionaries (MRDs). Five years ago the advice would have proclaimed the virtue of standards; these were currently being agreed, and once that was done, easy-to-use lexical resources co...
Book Reviews: Machine Translation and the Lexicon
The practical success of machine translation (MT) depends on the ability to acquire, share, and manage lexical data. Rather than reinventing lexicons for each new system and application, it is preferable to leverage common lexical resources. Increasingly, researchers are using pre-existing resources such as machine-readable dictionaries (MRDs) and corpora to acquire lexicons and term banks for ...
COMPLEX: a computational lexicon for natural language systems
Although every natural language system needs a computational lexicon, each system puts different amounts and types of information into its lexicon according to its individual needs. However, some of the information needed across systems is shared or "identical" information. This paper presents our experience in planning and building COMPLEX, a computational lexicon designed to be a repository ...
Towards machine-readable lexicons for South African Bantu languages
Lexical information for South African Bantu languages is not readily available in the form of machine-readable lexicons. At present the availability of lexical information is restricted to a variety of paper dictionaries. These dictionaries display considerable diversity in the organisation and representation of data. In order to proceed towards the development of reusable and suitably standard...
Publication date: 2007